ABSTRACT


Use-definition chaining is a well-known pragmatic
technique for solving various classes of data flow
analysis problems. In this paper we compare this method
with the standard flow graph method for solving such
problems and show that, when applied to the class of
attribute flow analysis problems (which includes constant
propagation, type analysis, etc.), both methods yield
the same solution, but that the use-definition method
is generally much more efficient than the flow graph
method. We also describe and analyze a possible
adaptation of the use-definition technique to
interprocedural attribute flow analysis.




1.    Introduction 

Data flow analysis of computer programs has become a
key tool for program analysis and optimization. However,
a significant portion of the extensive literature on this
subject concerns itself with abstract theory and methods,
and the issue of practical implementation of these abstract
methods does not always receive adequate treatment. As the
role of code optimization tends to become more central
in compiler design and implementation (especially in the
development of progressively higher level programming
languages), more thought will need to be given to the
efficiency of existing data flow algorithms and
methods.

This paper aims to narrow the gap between theory and
practice by providing a theoretical study of the well-known
but little-analyzed data flow method of attribute propagation
along definition-use links ([Al^]; see below for
detailed definitions). This method propagates data attributes
from variable definitions directly to variable uses,
rather than propagating these attributes along flow graph
edges between basic blocks (for which approach see e.g. [He]).
We will compare these two methods for an important class
of attribute flow problems, in which attributes such as
value, type and range are to be computed for each variable
occurrence. We will show that both techniques yield the
same solution when applied to such an analysis problem, but
that in general the use-definition method is much more
efficient and requires much less space and time than the
flow graph method. Our analysis will allow us to relate
various known flow graph based algorithms for performing
analyses of this class to more efficient algorithms
based on definition-use propagation. We note that
several recent data flow algorithms (cf. e.g. [JM,], [KaU],
[JM-], [Ka]) are usually so prohibitively expensive
in their space (and time) requirements as to be impractical
for use in a pragmatic general purpose optimizer
unless they are reformulated in terms of the use-definition
technique. Indeed, in reviewing recently published
attribute flow analysis algorithms, one can usually distinguish
easily between those that have been devised for pragmatic
application (cf. [Te], [Ha]) and employ the use-definition
technique, and those that have not (cf. [JM,], [Ka], [KaU]).

Relevant notation and terminology will be introduced
in section 2, in which we will describe the two methods
mentioned above in some detail, including data flow
frameworks to be used, data flow equations to be solved
and the relation of their (maximal fixpoint) solution
to the actual variable attributes to be computed. In
section 3 we will prove equivalence between these two
methods. Our main theorem (Theorem 3.5) shows that
applying the flow graph based method to an attribute
flow problem, which yields attribute information
at entry to each basic block, and then propagating
this information locally through the block assigns to each
variable occurrence an attribute identical to the one
that would have been obtained by applying the use-definition
propagation technique. The proof is more
complicated than might be expected, due to the fact that
the data flow frameworks of most important attribute flow
analyses are not distributive (cf. [He]). In the
distributive case, a much shorter proof of this theorem,
which would also show that the two methods yield
the 'meet over all paths' solution ([Ki], [He]), can be
given. However, since for nondistributive frameworks
this second solution is not recursively calculable, we
must give a more complex proof, and also content
ourselves with showing that both the flow graph and the
use-definition techniques attain the same (possible)
underestimate of the 'meet over all paths'
solution.

Section 4 will then discuss the efficiency of the
use-definition approach and will note several other
pragmatically useful features of the use-definition map.
Section 5 considers adaptation of the use-definition
method to interprocedural analysis and discusses various
problems arising in this case.


2.    Terminology and Notation

This section introduces relevant notation and
terminology. The reader is referred to [He] and [AU]
for basic terminology concerning data-flow
analysis.

The class of data flow analyses considered in this
paper is the class of attribute-flow analyses, in which
we wish to compute a certain attribute (value, type,
range etc.) for each variable occurrence in a program P.
This class of analyses includes constant propagation
analysis [Ki], in which the required attribute is a value,
type analysis ([Te], [JM,], [KaU]), and range analysis [Ha].

A data flow framework for an attribute flow
analysis of the class which interests us can be defined
as follows: Let $L_0$ denote a lattice of attributes that
can be acquired by program variables. Let $\Sigma$ denote the
set of all the variables occurring in the program P which
is to be analyzed. Then the lattice $L$ used in our analysis
is defined to be $L_0^{\Sigma}$, i.e. the set of all maps from
$\Sigma$ to $L_0$, with meet defined on $L$ in a pointwise manner.
We assume that $L_0$ is a well founded lattice. This makes $L$
a well founded lattice, since $\Sigma$ is always a finite set. Let
$\Omega \in L_0$ be a largest element, indicating an undefined
attribute (i.e. the attribute of undefined (uninitialized)
variables), and let $\Omega^* \in L$ denote the map which maps
each $V \in \Sigma$ to $\Omega$.

Example. In constant propagation, $L_0$ can be taken to
be VALUES $\cup\ \{\Omega, \omega\}$, where VALUES is a set of relevant
run time values that can be acquired by the variables in $\Sigma$,
where $\omega$ is the smallest element of $L_0$, indicating more than
one possible value, and where the meet of any two different
elements of VALUES yields $\omega$.


(Note that in this example, smaller elements of $L_0$
correspond to less specific attributes, and that meet in
$L_0$ corresponds to taking the logical 'or' of the predicates
on program states describing the respective elements of $L_0$.
This is compatible with standard data flow notation
(as in [KU], [Ro.] etc.).)
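
The constant propagation lattice of the example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sentinel names OMEGA and BOTTOM (standing for $\Omega$ and $\omega$) and the function names are hypothetical and do not come from the paper, and run time values are assumed never to collide with the two sentinel strings.

```python
# Flat lattice L0 = VALUES ∪ {OMEGA, BOTTOM} for constant propagation.
# OMEGA  stands for the largest element (undefined / uninitialized).
# BOTTOM stands for the smallest element (more than one possible value).
# Both names are hypothetical sentinels chosen for this sketch.
OMEGA = "OMEGA"
BOTTOM = "BOTTOM"


def meet(a, b):
    """Meet of two elements of the flat lattice L0."""
    if a == OMEGA:
        return b
    if b == OMEGA:
        return a
    # Equal elements meet to themselves; different ones fall to BOTTOM.
    return a if a == b else BOTTOM


def meet_maps(x, y):
    """Pointwise meet on L, the maps from variables to L0."""
    return {v: meet(x[v], y[v]) for v in x}
```

Here meet(3, 4) gives BOTTOM because two different run time values are possible, while meet(OMEGA, 3) gives 3 because the undefined attribute is the largest element.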

To define the set F of data propagation maps associated
with $L$, we proceed as follows: With an instruction I
having the form

    $V := op(V_1, V_2, \ldots, V_k)$

in the program to be analyzed we associate a 'shadow'
instruction $\bar{I}$, describing the way in which the attribute
of the output variable V depends on the attributes of
the input arguments $V_1, \ldots, V_k$. This $\bar{I}$ will have
the form

    $\alpha := A_I(\alpha_1, \alpha_2, \ldots, \alpha_k)$,

where $\alpha_1, \ldots, \alpha_k \in L_0$ are the assumed attributes of
$V_1, \ldots, V_k$ respectively, and $\alpha \in L_0$ is the resulting
attribute of V. $A_I$ is assumed to be a monotone map from
$L_0^k$ to $L_0$ (i.e. monotone in each of its arguments).
We then associate a function $H_I : L \to L$ with I as follows:
For each $x \in L$, $W \in \Sigma$, define

    $H_I(x)(W) = A_I(x(V_1), \ldots, x(V_k))$   if $W = V$
    $H_I(x)(W) = x(W)$                          if $W \neq V$


(This formula propagates information through an instruction
from input occurrences to output occurrences, and is
therefore appropriate for data flow problems in which
information is to be propagated in the direction of
execution flow (called problems of the forward type),
which are the only problems considered in this paper.
However, problems involving propagation in the reverse
direction of execution flow can be treated by methods
like those described below, even though treatment of such
backward problems usually requires construction of additional
maps which modify the attributes of input variables
as well.)

Note that instructions with no output variables are
transparent to the data flow propagation, and their $H_I$ map
can be taken to be the identity on $L$. Note finally that
$H_I$ is monotone in $L$ since $A_I$ is monotone in $L_0^k$.
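
As a rough illustration of how $H_I$ is built from $A_I$, the following Python sketch constructs the map for an instruction of the form V := op(V_1, ..., V_k). The helper names make_transfer and add_attr, and the sentinels OMEGA and BOTTOM, are hypothetical. The way add_attr treats an undefined operand (the result stays undefined) is an assumption made here, chosen so that the map is monotone; the paper does not prescribe it.

```python
OMEGA, BOTTOM = "OMEGA", "BOTTOM"  # hypothetical sentinels for Ω and ω


def make_transfer(target, a_fn, inputs):
    """Build H_I for `target := op(inputs...)`; a_fn plays the role of A_I."""
    def h(x):
        y = dict(x)  # H_I(x)(W) = x(W) for every W other than the target
        y[target] = a_fn(*(x[v] for v in inputs))
        return y
    return h


def add_attr(a, b):
    """Assumed A_I for `+` in constant propagation (monotone by construction)."""
    if a == BOTTOM or b == BOTTOM:
        return BOTTOM
    if a == OMEGA or b == OMEGA:
        return OMEGA
    return a + b


# H_I for the instruction z := x + y
h = make_transfer("z", add_attr, ["x", "y"])
state = {"x": 1, "y": 2, "z": OMEGA}
result = h(state)
```

With these definitions, result maps z to 3 and leaves x and y unchanged, and the input map state is not modified.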

We then define F as the smallest set of functions
acting on $L$ which is closed under functional compositions
and meets and contains the identity map and all the
above functions $H_I$.

Having constructed this framework, we can apply
the standard flow graph data flow analysis technique to it.
This technique assumes that the program being analyzed
is represented by a flow graph G, which is a rooted
directed graph whose set of nodes N is the set of all
program basic blocks (i.e. single entry code sequences),
whose edges are of the form (m,n), where m, n are basic
blocks such that n can be executed immediately after m,
and whose root (entry node) is the entry block r of
the program (i.e. the block at which execution starts).
For simplicity, r is assumed not to belong to any
cycle in G. In this initial model we ignore interprocedural
transfer of control (which will be discussed in section 5)
and thus assume the program being analyzed to contain no
subprocedures. To simplify our analysis, we will assume
that basic blocks are also single exit and that the
last instruction in each block is a branch instruction.

We can then associate with each edge $(m,n) \in G$ a
data flow map $f_{(m,n)}$, defined as follows: Let m consist
of the instructions $I_1, I_2, \ldots, I_j$ in order, and put

    $f_{(m,n)} = H_{I_j} \circ H_{I_{j-1}} \circ \cdots \circ H_{I_1}$

This shows the effect on $L$ of the advance of control from
the start of m to the start of n. (Note that in our model
$f_{(m,n)}$ is independent of n, and that the last (branch)
instruction really has no effect on the data flow.)
Obviously, $f_{(m,n)} \in F$.

Having defined these maps, we look for the (maximal
fixpoint) solution of the following standard data flow
equations (where $x_n \in L$ denotes attribute data to be
computed at the start of the basic block $n \in N$):

(2.1)    $x_r = \Omega^*$
         $x_n = \bigwedge \{ f_{(m,n)}(x_m) : (m,n) \in G \}, \quad n \in N - \{r\}$


It is well known ([Ki]) that a maximal fixpoint of
these equations exists, and can be found by successive
approximation using a variety of iterative techniques,
such as 'workpile propagation' [Ki] or round-robin
iteration through the nodes of N [HU]. Note that the
solution of (2.1) only yields information at basic block
entries, and an additional step of propagation through
each basic block is required. Specifically, for a block
$n \in N$ consisting of the sequence $I_1, \ldots, I_j$ of
instructions, and for each $k \le j$ we compute $w_{I_k}$, the
attribute data state at the start of $I_k$, using the
formula

(2.2)    $w_{I_k} = H_{I_{k-1}} \circ \cdots \circ H_{I_1}(x_n)$

Then for each input argument V of $I_k$, $w_{I_k}(V)$ yields the
attribute (in $L_0$) ascribed to V immediately prior to
execution of $I_k$. If V is an output variable of $I_k$,
then its attribute (known just after the execution of $I_k$)
is $H_{I_k}(w_{I_k})(V)$.

It is well known that the maximal fixpoint of (2.1)
need not coincide with the more accurate 'meet over all paths'
solution (see [He] where this is demonstrated for constant
propagation analysis). Nevertheless, our solution is
always safe, in the sense that the attributes that we
compute are always a lower bound in $L_0$ to the attributes
that can be actually acquired at run time at the
respective variable occurrences. (See [Ro,] for a more
general discussion of these issues.)
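
A minimal sketch of solving equations (2.1) by workpile propagation is given below, assuming the flat constant propagation lattice. The function solve_flow_graph and its argument layout are hypothetical; each block is represented directly by its composed map $f_{(m,n)}$, given as a Python function from attribute maps to attribute maps. Every block is placed on the workpile once, so that blocks whose entry data never changes still propagate their effect to their successors.

```python
from collections import deque

OMEGA, BOTTOM = "OMEGA", "BOTTOM"  # hypothetical sentinels for Ω and ω


def meet(a, b):
    if a == OMEGA:
        return b
    if b == OMEGA:
        return a
    return a if a == b else BOTTOM


def solve_flow_graph(blocks, edges, variables):
    """blocks: name -> map f (dict -> dict) for that block; edges: (m, n) pairs.
    Returns x[n], the attribute map at entry to each block n."""
    x = {n: {v: OMEGA for v in variables} for n in blocks}  # start at Ω*
    succs = {n: [] for n in blocks}
    for m, n in edges:
        succs[m].append(n)
    work = deque(blocks)  # every block is processed at least once
    while work:
        m = work.popleft()
        out = blocks[m](x[m])
        for n in succs[m]:
            new = {v: meet(x[n][v], out[v]) for v in variables}
            if new != x[n]:
                x[n] = new
                work.append(n)
    return x


# Diamond: r sets a := 1, the two branches set b := 2 and b := 3, then join.
blocks = {
    "r": lambda x: {**x, "a": 1},
    "b1": lambda x: {**x, "b": 2},
    "b2": lambda x: {**x, "b": 3},
    "j": lambda x: x,
}
edges = [("r", "b1"), ("r", "b2"), ("b1", "j"), ("b2", "j")]
x = solve_flow_graph(blocks, edges, ["a", "b"])
```

At the join block j the sketch finds a to be the constant 1 and b to be BOTTOM, since two different constants reach it. Note that a full attribute map over all variables is stored for every block, whether or not the block mentions those variables.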


Attribute flow analysis can also be accomplished
using a second, more efficient technique, which we will
call use-definition propagation. This approach begins
by computing a use-definition chaining map 'ud', which
is defined as follows [Al,]: for each use (input occurrence)
VO of a variable $V \in \Sigma$, define ud{VO} to be the set of
all definitions or modifications (output occurrences) of V
from which VO can be reached along an execution path
which is free of any other definitions or modifications of V.
This map can be computed by performing a standard reaching
definitions data-flow analysis ([He], [AU]). Since this analysis
involves a very simple framework, it can use any one of several
efficient data flow algorithms ([HU], [AC], [GW], etc.).
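
The ud map can be illustrated with a small round-robin reaching definitions computation over an instruction-level graph. This is a sketch only: the function ud_chains and the encoding of instructions as (defined variable or None, list of used variables) pairs are assumptions made for the example, and the simple iteration shown is not one of the efficient algorithms cited above.

```python
def ud_chains(instrs, succs):
    """instrs: list of (defined_var_or_None, [used_vars]);
    succs: instruction index -> list of successor indices.
    Returns {(i, v): set of indices of definitions of v reaching the use at i}."""
    n = len(instrs)
    preds = {i: [] for i in range(n)}
    for i in range(n):
        for j in succs.get(i, []):
            preds[j].append(i)
    # reach_in[i] holds the (variable, defining index) pairs reaching entry of i
    reach_in = [set() for _ in range(n)]
    changed = True
    while changed:  # round-robin iteration until nothing changes
        changed = False
        for i in range(n):
            new_in = set()
            for p in preds[i]:
                d = instrs[p][0]
                # definitions surviving p, plus the definition made by p itself
                out_p = {(v, k) for (v, k) in reach_in[p] if v != d}
                if d is not None:
                    out_p.add((d, p))
                new_in |= out_p
            if new_in != reach_in[i]:
                reach_in[i] = new_in
                changed = True
    return {(i, v): {k for (u, k) in reach_in[i] if u == v}
            for i, (_, uses) in enumerate(instrs) for v in uses}


# 0: c := read   1: x := 1   2: x := 2   3: y := x   (0 branches to 1 or 2)
branch_ud = ud_chains([("c", []), ("x", []), ("x", []), ("y", ["x"])],
                      {0: [1, 2], 1: [3], 2: [3]})
# 0: i := 0   1: i := i + 1   (1 loops back to itself)
loop_ud = ud_chains([("i", []), ("i", ["i"])], {0: [1], 1: [1]})
```

In the first program both definitions of x reach the use at instruction 3; in the loop, the use of i at instruction 1 is reached both by the initialization and by the increment itself.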


Once having computed the ud map, we can perform
attribute flow analysis using the following ud propagation
technique: Let 'attr' denote a map sending each variable
occurrence in the program being analyzed to its ascribed
attribute (in $L_0$). This map should satisfy the following
set of equations

(2.3)    (i)   $attr(VO) = \bigwedge \{ attr(VO') : VO' \in ud\{VO\} \}$
               for each variable use VO

         (ii)  $attr(VO) = A_I(attr(VO_1), \ldots, attr(VO_k))$
               for each variable modification (definition)
               VO appearing as an output variable of an
               instruction I, having $VO_1, \ldots, VO_k$ as
               input variable occurrences

Equations (2.3) can be solved iteratively by successive
approximation to obtain a maximal fixpoint solution,
which exists since these equations are monotone in 'attr',
viewed as an element of the (well founded) lattice of maps
from variable occurrences into $L_0$.

To solve these equations, one can use either the workpile
propagation technique of [Ki], or a round-robin iteration
as in [HU], with the 'attr' map initially set to map
all variable occurrences to $\Omega$. Note that a program will
contain variable defining instructions having no input
(variable) arguments, such as read instructions or operations
on constants. The $A_I$ function corresponding to such an
instruction I will then be constant, so that the attribute
of the output occurrence of I will be constant, independent
of any attribute propagation. If the workpile propagation
solution technique is used, then propagation should
start at the output occurrences of all these instructions;
if a round-robin iteration is used, no special treatment
need be given to these instructions.
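
Equations (2.3) can be solved by a round-robin iteration along the following lines. This is a sketch under stated assumptions: the function ud_propagate, the encoding of instructions as (target or None, A_I function, input variables) triples, and the helper add1 are all hypothetical, the ud map is taken as given, and the lattice is the flat constant propagation lattice.

```python
OMEGA, BOTTOM = "OMEGA", "BOTTOM"  # hypothetical sentinels for Ω and ω


def meet(a, b):
    if a == OMEGA:
        return b
    if b == OMEGA:
        return a
    return a if a == b else BOTTOM


def ud_propagate(instrs, ud):
    """instrs: list of (target_or_None, a_fn, [input vars]);
    ud: {(i, v): set of defining instruction indices} for every use.
    Returns (use_attr, def_attr) for input and output occurrences."""
    use_attr = {key: OMEGA for key in ud}
    def_attr = {i: OMEGA for i, (t, _, _) in enumerate(instrs) if t is not None}
    changed = True
    while changed:
        changed = False
        for key, defs in ud.items():  # equations (2.3)(i)
            new = OMEGA
            for d in defs:
                new = meet(new, def_attr[d])
            if new != use_attr[key]:
                use_attr[key] = new
                changed = True
        for i, (t, a_fn, ins) in enumerate(instrs):  # equations (2.3)(ii)
            if t is None:
                continue
            new = a_fn(*(use_attr[(i, v)] for v in ins))
            if new != def_attr[i]:
                def_attr[i] = new
                changed = True
    return use_attr, def_attr


def add1(a):
    """Assumed A_I for `v + 1`: sentinels pass through unchanged."""
    return a if a in (OMEGA, BOTTOM) else a + 1


# 0: x := 1   1: x := 1   2: y := x + 1, with both definitions reaching 2
same_use, same_def = ud_propagate(
    [("x", lambda: 1, []), ("x", lambda: 1, []), ("y", add1, ["x"])],
    {(2, "x"): {0, 1}})
# 0: i := 0   1: i := i + 1 inside a loop
loop_use, loop_def = ud_propagate(
    [("i", lambda: 0, []), ("i", add1, ["i"])],
    {(1, "i"): {0, 1}})
```

In the first program both definitions carry the constant 1, so the use of x is the constant 1 and y is the constant 2. In the loop, the use of i meets 0 with the incremented value and falls to BOTTOM. Only one attribute per variable occurrence is stored.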

Remark. Equations (2.3)(i) reflect the assumption that
the attribute of a variable V is not changed between a
point at which V is defined and a subsequent point at
which V is used, so that the attribute computed for the
definition of V can be propagated directly to the use.
This assumption will fail for some relational attribute
flow analyses, in which the attributes being sought are
actually relations between variables. A typical example
of this sort is found in Kaplan [Ka, p. 14], who discusses
value flow analysis (cf. [Sc,]). This example contains
roughly the following code:

(1)  read  y; 


(2)    w  :=  {y}      (3)   x  :=  {w} 


Suppose here that the attribute map that we wish to compute
is 'crpart', which maps each variable occurrence VO to the
set of all prior output occurrences whose value might become
part of the value of VO. Such a relational attribute map
need not satisfy the above assumption. Indeed in this
example there exists a path (2,1,3) from the definition
of W at (2) to its use at (3), such that the 'crpart'
attribute of W changes along this path, because

    crpart(W at (2)) = {y at (1)},

but

    crpart(W at (3)) = $\emptyset$

In what follows, we will restrict our discussion to attribute
flow analyses in which variable attributes do not change
unless the variable is modified. It is interesting to note
that this difficulty is missed in [Sc ].

A similar situation can also arise in cases where program
variables can be aliased to each other, either by passing
procedure parameters by reference, or by using pointer variables.
In such cases a variable attribute may change due to a
modification of an aliased variable. We will therefore also assume
that no variable aliasing takes place in the program being
analyzed. When such aliasing does occur, the use-definition
propagation technique must be used with caution, modifying it
to take into account all possible aliases of variables, and
even then it may fail to produce as sharp a solution as the
flow-graph technique.


3.    Equivalence of the flow graph and the ud propagation methods

In this section we show that the two attribute propagation
techniques described in section 2 yield identical
solutions when applied to attribute flow analyses for
which our assumption concerning attribute preservation
between variable definitions and subsequent uses is
valid.

To facilitate the proof of this claim, we first
convert the flow graph approach into another approach,
somewhat closer to the ud propagation technique, as follows:
Rather than partitioning the given program into basic
blocks, we partition it into single instruction blocks,
to obtain a new and larger flow graph $G_I$ whose set of
nodes $N_I$ consists of all program instructions, whose
entry node $I_0$ is the entry instruction of the program
(or procedure), and whose edges are of the form (I,J)
where I and J are control-adjacent instructions (that is,
are instructions for which control can be transferred
directly from I to J).

We can then use the data flow framework (L,F) in an
analysis of the new flow graph. This calls for the
solution of the following attribute flow equations (where,
for each $I \in N_I$ we let $z_I \in L$ denote the attribute map
attached to the program point which immediately precedes I):

(3.1)    $z_{I_0} = \Omega^* \in L$,
         $z_I = \bigwedge \{ H_J(z_J) : (J,I) \in G_I \}$   for each $I \neq I_0$.

(Note that $H_J$ represents the change in attributes
which takes place when control passes from J to any of
its successors; this change depends solely on J itself.)
These equations can be solved iteratively by any of the
standard propagation techniques mentioned in section 2,
setting initially $z_I = \Omega^*$ for each $I \in N_I$.

As a first step in our analysis, we show the following.

Theorem 3.1. For each $n \in N$, $x_n = z_{I_n}$, where $I_n$ denotes
the initial instruction of n.

Proof: Let $n \in N$ be the sequence $J_1, \ldots, J_k$ of instructions.
As in (2.2), define

(3.2)    $w_{J_t} = H_{J_{t-1}} \circ \cdots \circ H_{J_1}(x_n)$

for each $t \le k$. We claim that $\{w_J\}_{J \in N_I}$ is a solution of
Equations (3.1). Obviously, $w_{I_0} = x_r = \Omega^*$. Let $I \in N_I$ be
an instruction which is not the initial instruction in its
basic block. Then I has a single predecessor J, and (3.2)
implies that $w_I = H_J(w_J)$, so that (3.1) is satisfied
in this case. If $I = I_n$ for some basic block $n \neq r$, $n \in N$,
then, by (2.1),

    $w_I = x_n = \bigwedge \{ f_{(m,n)}(x_m) : (m,n) \in G \}$

But for each such m, $f_{(m,n)}(x_m) = H_J(w_J)$, where J is the
branch from m to n. Hence

    $w_I = \bigwedge \{ H_J(w_J) : (J,I) \in G_I \}$

(Note that these J are precisely all the predecessors of I
in $G_I$.) Therefore (3.1) is satisfied by the $w_J$'s, and since
the $z_J$'s are the maximal fixpoint of (3.1) we must
have

    $w_J \le z_J$,   for each $J \in N_I$.

Conversely, for each $n \in N$, define $\hat{x}_n = z_{I_n}$. We claim
that $\{\hat{x}_n\}_{n \in N}$ satisfies Equations (2.1). Indeed,
$\hat{x}_r = z_{I_0} = \Omega^*$. Let $n \neq r$, $n \in N$. By (3.1) we have

    $\hat{x}_n = z_{I_n} = \bigwedge \{ H_J(z_J) : (J,I_n) \in G_I \}$

But an instruction J appears in the above meet iff there
exists $m \in N$ such that $(m,n) \in G$ and J is a branch
instruction from m to n (i.e. to $I_n$). Hence

    $H_J(z_J) = f_{(m,n)}(z_{I_m}) = f_{(m,n)}(\hat{x}_m)$

so that

    $\hat{x}_n = \bigwedge \{ f_{(m,n)}(\hat{x}_m) : (m,n) \in G \}$

which establishes our claim. Hence $z_{I_n} = \hat{x}_n \le x_n$ for each
$n \in N$, which implies the assertion of our theorem.      Q.E.D.

Next, we utilize the special structure of the
functions $\{H_I\}$ to prove the following.


Lemma 3.2. For each $I \in N_I$ and $V \in \Sigma$ we have

(3.3)    $z_I(V) = \bigwedge \{ H_J(z_J)(V)$ : V is modified in J and there
         exists a path free of modifications of V
         from J to I $\}$

Proof: Denote the right-hand side of (3.3) by $\hat{z}_I(V)$. Let
$I \in N_I$ and $V \in \Sigma$ be given, let $J \in N_I$ be an instruction
appearing in the meet in (3.3) and let $(J = K_1, K_2, \ldots, K_s = I)$
be a path free of modifications of V. Then, by (3.1) and
the definition of the functions $H_K$, we have

    $z_{K_2}(V) \le H_J(z_J)(V)$
    $z_{K_3}(V) \le H_{K_2}(z_{K_2})(V) = z_{K_2}(V)$
    ...
    $z_I(V) \le H_{K_{s-1}}(z_{K_{s-1}})(V) = z_{K_{s-1}}(V)$

Hence $z_I(V) \le H_J(z_J)(V)$, so that $z_I(V) \le \hat{z}_I(V)$.

Conversely, assume that equations (3.1) are solved using
a workpile-driven propagation scheme, as in [Ki]. For
each $k \ge 0$ let $z_I^k$ denote the value of $z_I$ after k propagation
steps of this iterative process, and let $\hat{z}_I^k$ denote the
corresponding value of $\hat{z}_I$. Then we claim that

(3.4)    $z_I^k(V) \ge \hat{z}_I^k(V)$,   $I \in N_I$, $V \in \Sigma$, $k \ge 0$.


The proof is by induction on k. Initially we have
$z_I^0(V) = \Omega \ge \hat{z}_I^0(V)$ for each I, V. Suppose that (3.4)
is true for some $k \ge 0$, and consider the (k+1)-st propagation
step, which propagates data from some instruction J to
one of its successors I. Let $I' \in N_I$, $V \in \Sigma$. If $I' \neq I$
then, by the induction hypothesis,

    $z_{I'}^{k+1}(V) = z_{I'}^k(V) \ge \hat{z}_{I'}^k(V) \ge \hat{z}_{I'}^{k+1}(V)$

If $I' = I$ then

    $z_I^{k+1}(V) = z_I^k(V) \wedge H_J(z_J^k)(V)$

By the induction hypothesis we have $z_I^k(V) \ge \hat{z}_I^k(V) \ge \hat{z}_I^{k+1}(V)$.
If V is modified in J then J appears in the meet for the
expression $\hat{z}_I(V)$. Since $z_J^k \ge z_J^{k+1}$ and $H_J$ is monotone in $L$,
we obtain

    $H_J(z_J^k)(V) \ge H_J(z_J^{k+1})(V) \ge \hat{z}_I^{k+1}(V)$

so that $z_I^{k+1}(V) \ge \hat{z}_I^{k+1}(V)$ in this case. On the other hand,
if V is not modified in J, then

    $H_J(z_J^k)(V) = z_J^k(V) \ge \hat{z}_J^k(V) \ge \hat{z}_J^{k+1}(V)$

But each instruction K appearing in the meet for the
expression $\hat{z}_J^{k+1}(V)$ must also appear in the expression $\hat{z}_I^{k+1}(V)$,
because the existence of a modification-free path from K to J
implies the existence of a similar path from K to I, obtained
by concatenating the edge (J,I) to the first path. Hence
we have $\hat{z}_J^{k+1}(V) \ge \hat{z}_I^{k+1}(V)$, so that in this case too we
have

    $z_I^{k+1}(V) \ge \hat{z}_I^{k+1}(V)$

This establishes (3.4), from which the lemma follows readily.

Q.E.D.


The result established in Lemma 3.2 can be expressed
in a form showing the close connection between the 'attr'
map of (2.3) and the map z, as follows. Introduce
an auxiliary map '$\widehat{attr}$', defined as follows: for each
$I \in N_I$, $V \in \Sigma$ which is an input argument of I, put

    $\widehat{attr}(V_I) = z_I(V)$,

(where $V_I$ denotes that occurrence of V at I); if V is an
output variable of I, then define

    $\widehat{attr}(V_I) = H_I(z_I)(V)$.

Corollary 3.3. For each $I \in N_I$ and each variable occurrence
$V_I$ in I we have

    $\widehat{attr}(V_I) \le attr(V_I)$

Proof. We will show that the map '$\widehat{attr}$' satisfies
Equations (2.3). Since 'attr' is the maximal fixpoint of
these equations this will imply the above inequality.
Suppose first that $V_I$ is an input occurrence in I. By (3.3)
and the definition of the ud-map we can write

    $\widehat{attr}(V_I) = z_I(V) = \bigwedge \{ H_J(z_J)(V) : V_J \in ud\{V_I\} \}$
                         $= \bigwedge \{ \widehat{attr}(V_J) : V_J \in ud\{V_I\} \}$

which is precisely (2.3)(i). If $V_I$ is an output occurrence
in I, then

    $\widehat{attr}(V_I) = H_I(z_I)(V) = A_I(z_I(V_1), \ldots, z_I(V_k))$

(where $V_1, \ldots, V_k$ are the input arguments of I),

                         $= A_I(\widehat{attr}(V_{1,I}), \ldots, \widehat{attr}(V_{k,I}))$,

which is precisely (2.3)(ii). This proves our assertion.

Q.E.D.


Next  we  show  that  the  two  maps  'attr'  and  'attr'  are 
identical . 


Lemma  3.4.   For  each  I  e  N   and  each  variable  occurrence 


V   in  I  we  have 


attr(V^)  =   attr(Vj) 
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Proof.   By  virtue  of  Corollary  3.3   it  is  sufficient  to 
prove  that  attr(V  )  <_  attr(V^)  for  each  occurrence  V^ . 
To  do  this,  let   attr'^CV  ),   k  >_  0 ,  denote  the  value  of 
this  map  after  the  first  k  propagation  steps  in  the  itera- 
tive solution  of  Equations  (3.1),   as  in  the  proof  of 
Lemma  3.2.   We  claim  that 

attr'^{V^)  >_  attr(V^) 

for  each  variable  occurrence  V^  and  k  >_  0 .   To  prove  this, 
we  use  induction  on  k.   If  k  =  0  then  clearly 

attr°  (V  )  =   n   >_   attr(V  )  ,   for  each  V   . 

Suppose  that  these  inequalities  hold  for  some  k  >_  0 ,  consider 
the  (k+l)-st  propagation  step,  and  suppose  that  this  step 
propagates  data  from  some  instruction  J  e  N^  to  one  of  its 
successors  I.   Let  I  e  N_  .  If  I  7^  I  then  by  the  induc- 
tion hypothesis, 

^■^\  k4-1  ^^~^  k 

attr    (V-j.)  =  attr  (V^)  >_  attr(V^)  ,   for  each  V^  in  I  . 


If  I  =  I  and  Vj    is  an  input  occurrence  in  I 


then 


/■^\k  +  1  V  +  1  V  k 

attr'^  -^(V^)  =  z^  -^(V)  =  z'^(V)  A  H  (z^)(V) 
i      i  i        J  J 

k        ^ — \  k 

But  z  (V)  =  attr  (V  )  >_  attr(V  )  ,  by  the  induction  hypothesis, 

i  i,        i  ,         , 

If  V  is  modified  in  J,  then  H  (zf) (V)  =  attr  (V,)  and 

J   J  J 

V  e  ud{v  }.   Hence,  by  (2.3) (i)  and  the  induction  hypothesis, 

J      i 

H  (z^)  (V)  >  attr(V^)  >_  attr(V  ) 
J   J  J  i 

x-^  k+1 

so  that  in  this  case   attr    (V_^)  >_  attr(V^).  On  the  other 

__!  I 

hand,  if  V  is  not  modified  in  J,  then  by  (3.4)  and  its  proof, 
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H^(z^)(V)  =  zNv)  >  z'^(V)  >  zhv) 
J   J         J        J        ■'■       . 

As  in  the  proof  of  Corollary  3 . 3  we  can  then  write 
z^(V)  =  A  (^ttr'^(V_):  V^  e  udCv  }| 

i       l^      J   J      J  J 

>  A  jattrCV^):  V^  €  ud{v  }\    =   attr(V^)  . 
-    I       J     J        J  j  ^ 

Finally,  if  V^    is  an  output  occurrence  in  I,  then 
J 

att^r^^^(V^)  =  H^(zl^"^^{V))  =A  (z'^"'^  (V,  ),...,  z*^^^  {V, )  ) 

III         li-^      i 

where  V,  ,...,. V»  are  the  input  arguments  of  I.  But  by  what 

we  have  just  shown  and  by  the  monotonicity  of  A^  ,  we  obtain 

I 

attr   -^(V  )  >_  A_(attr(V   ,),...  ,attr  (V   ))  =  attr  (V  ) 
i      I       1,1  £,i  i 

by  (2.3)  (ii)  .   This  establishes  our  claim  and  completes 

the  proof  of  our  lemma.  Q.E.D, 
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I 

The following theorem now establishes the equivalence
of the flow graph and the ud propagation methods.

Theorem 3.5. For each $I \in N_I$ let $w_I \in L$ be defined
as in (2.2). Then

(i)    If $V_I$ is an input occurrence in I then $attr(V_I) = w_I(V)$.
(ii)   If $V_I$ is an output occurrence in I then
       $attr(V_I) = H_I(w_I)(V)$.

Proof: Immediate from Theorem 3.1 and Lemma 3.4.      Q.E.D.

Remark.   A  technique  similar  to  ud-propagation,  propagates 

data  between  vari.able  occurrences  linked  by  a  modified 

data  flow  map         'udu'.    This  map  links  each  use 

of  a  variable  x  to  all  prior  modifications  and  uses  of  x, 

from  which  the  given  use  can  be  reached  along  a  path  free 

of  any  other  occurrences  of  x.   As  noted  in  [Sc  ] ,  this  \ 

approach  is  likely  to  be  more  efficient  than  the  ud  approach 

for   typical  programs.   The  same  arguments  used  to  prove 

Theorem  3.5   can  be  used  to  show  that  this  modified  technique 

also  produces  the  same  output  as  the  flow  graph  method. 
4.    Various properties of the ud map

In this section we comment briefly on various
pragmatically significant properties of the ud map.
We have already shown that it can be used to perform
attribute-flow analysis by direct propagation from
variable definitions to their uses, and that this method
yields the same results as the standard flow graph based
method. To convince ourselves that use-definition propagation
is much more efficient than flow graph propagation
for attribute flow analysis, we note that the space
required by the flow graph technique is $O(|N| \times |\Sigma|)$,
since we have to compute an attribute map in L (which map
will require space $O(|\Sigma|)$) for each node $n \in N$, even though
most of the variables of $\Sigma$ will not have appeared in n.
The space required by the ud propagation method, ignoring
the space needed to store the ud map itself, is linear
in the number of variable occurrences in the program and
for typical programs is therefore much smaller than the
space required by the flow graph method. In fact,
this space requirement is obviously the smallest possible for
such a type of data flow analysis. Also, the space required
for the ud map will usually be linear in the number of variable
occurrences, because in typical programs the number of definitions
that can reach a given use is often very small (one or
two on the average); and once computed, this map can be used in
all subsequent attribute flow analyses. Once having computed
the ud map, we can in fact ignore the particular representation
of the program to be analyzed, and perform attribute flow
analysis by ud propagation, which is essentially independent
of the program representation. Thus, the ud propagation
method can also be used in cases where the program to be
analyzed is represented by its parse tree or by other
alternative representations, as in Rosen's high-level data
flow analysis ([Ro, ]; cf. also [MFS]).

An additional disadvantage of the flow graph method
is that the functions $f_{(m,n)}$ which it needs are functional
compositions of rather complex functions, whose computation
and storage may be infeasible in practice.

For all these reasons the ud propagation technique is
preferable to the flow graph technique in the pragmatic
solution of attribute flow problems. (Note however that
the worst case behavior of the ud propagation method
may be as bad as the behavior of the flow graph method,
and even worse, due to the overhead of the preliminary
calculation of 'ud'; but worst case programs are very
unlikely to occur in practice.)
5.    Issues in interprocedural ud propagation

The presence of procedures and procedure calls in
the program to be analyzed creates special problems for
any data flow analysis technique (cf. [Al_], [Ro2], [Ba],
[SP]). The problem for the use-definition propagation
method is twofold: (a) how to compute the ud map
correctly in the presence of interprocedural flow, and
(b) how to apply it in attribute flow analysis to obtain
sharp information.

The first problem has by now been studied fairly
extensively and several methods which compute accurate
versions of the ud map interprocedurally are known
(cf. [Al_] for nonrecursive programs, [SS] for recursive
programs with no variable aliasing; for cases where
variable aliasing can occur, one can compute the ud map
by combining an aliasing analysis such as in [Ba]
with the interprocedural analysis technique of [SS],
but as noted in Section 2, we will exclude this latter case
from our study). Note that the main issue here is to establish
correct links between global variable occurrences across
procedures.

The second problem mentioned above has been less
studied. To see what the problem is, consider the
following example (where x, y are global variables):
    procedure P1           procedure Q           procedure P2
        ...                                          ...
    (1) x := 1                                   (5) x := 2
        ...                (4) y := x                ...
    (2) call Q                                   (6) call Q
        ...                end;                      ...
    (3) z := y                                   (7) z := y

Any valid computation of the ud map will satisfy

    ud{x_4} = {x_1, x_5}
    ud{y_3} = {y_4}
    ud{y_7} = {y_4} .

Thus, in applying Equations (2.3) to some attribute flow
analysis (say, constant propagation) we might propagate
attributes from x_1 to x_4, then to y_4, then to y_7 and
finally to z_7, tracing an obviously spurious path, to
conclude that z_7 has no constant value, although as
a matter of fact it has the constant value 2.
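
The loss of precision can be reproduced with a small sketch of plain ud propagation for constant propagation on the example program. The lattice encoding (`TOP`, `NAC`), the occurrence names such as `"x4"`, and the fixpoint loop are illustrative assumptions, not the paper's algorithm; only the ud links are taken from the example.

```python
# Constant-propagation lattice (assumed encoding): TOP means "no
# information yet", NAC means "not a constant"; constants lie between.
TOP = "undefined"
NAC = "not-constant"

def meet(a, b):
    """Lattice meet for constant propagation."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    return a if a == b else NAC

# ud map of the example: each use -> the definitions that can reach it.
ud = {"x4": ["x1", "x5"], "y3": ["y4"], "y7": ["y4"]}

# Each definition: (transfer function, input occurrences of its instruction).
defs = {
    "x1": (lambda: 1, []),        # (1) x := 1
    "x5": (lambda: 2, []),        # (5) x := 2
    "y4": (lambda v: v, ["x4"]),  # (4) y := x
    "z3": (lambda v: v, ["y3"]),  # (3) z := y
    "z7": (lambda v: v, ["y7"]),  # (7) z := y
}

def propagate(ud, defs):
    """Iterate the two propagation rules until nothing changes."""
    attr = {occ: TOP for occ in list(ud) + list(defs)}
    changed = True
    while changed:
        changed = False
        for occ in attr:
            if occ in ud:                      # use: meet over reaching defs
                new = TOP
                for d in ud[occ]:
                    new = meet(new, attr[d])
            else:                              # definition: apply transfer
                fn, inputs = defs[occ]
                new = fn(*(attr[i] for i in inputs))
            if new != attr[occ]:
                attr[occ] = new
                changed = True
    return attr

attr = propagate(ud, defs)
```

Because `x4` merges the definitions from both callers, every occurrence downstream of it, including `z7`, ends up as `NAC`, which is safe but weaker than the true value 2.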
This example shows that even if interprocedural ud
links are computed accurately, their use in a ud propagation
solution of an attribute flow analysis may lead to an
inaccurate (but nevertheless overestimated, and hence safe)
solution. The reason is that each ud link corresponds to
some execution path (or set of such paths) along which this
link can be realized. Thus attribute propagation along
a series of such links, using Equations (2.3), amounts
to tracing execution flow along the concatenation of
such paths which, in the interprocedural case, may
result in an interprocedurally invalid path.

The simplest approach to this problem is to
ignore it, use Equations (2.3) as they stand, and obtain
overestimated attribute information in situations as in
the example given above. However, sharper information can
be obtained at the expense of more complicated algorithms.
In what follows we will suggest a possible approach
based on the 'call-string' technique of [SP]. As will be
seen below this approach is fairly complicated, but nevertheless
interesting from a theoretical point of view.  It
can  also  be  generalized  to  other  situations  where  execution 
paths  can  be  concatenated  only  in  a  selective  manner,  using 
the  qualified  data-flow  analysis  approach  of  Holley  and 
Rosen  [HR] .   As  indicated  by  their  experience  such  an 
approach  may  still  be  pragmatically  feasible  in  spite  of 
its  complexity,  especially  in  cases  where  sharper  information 
is  desired  and  is  likely  to  have  a  high  payoff   in  the 
optimization  of  the  program.  (This  is  the  case  e.g.  when 
the  effects  of  in-line  procedure  integration  are  to  be 
analyzed  before  actual  expansion  of  the  procedures.) 

We will assume familiarity with the 'call-string'
approach as outlined in [SP, Sections 4-6]. Its
main advantage is that it constructs a modified
data-flow framework (L*,F*) and uses it instead of a standard
framework (L,F) to set down data-flow equations which define
the required solution in precisely the same way as is done
in a purely intraprocedural data-flow analysis. As was done
in [SP], we will first describe a purely abstract interprocedural
use-definition approach, which in the general (recursive)
case does not yield a finitely converging algorithm,
but will serve to set down a framework in which a certain
interprocedural version of the use-definition chaining map
can be defined and used in a manner rather analogous to the
intraprocedural ud-approach. This framework with only minor
modifications will then be used to obtain an effective,
though approximative, interprocedural ud-technique for solving
attribute flow analyses.

We start with a definition of an interprocedural
version of the ud map. Most of our notations are taken
from [SP]. In addition, we denote by $\Pi$ the set of all
variable occurrences in a program.

Definition.  We define an interprocedural use-definition chaining
map, denoted by ud*, as a relation in $\Pi \times \Gamma$ ($\Gamma$ being
the set of all call-strings), such that $\mathrm{ud}^*\{(VO,\gamma)\}$ contains
$(VO_1,\gamma_1)$ iff VO is a use of some variable $V \in \Sigma$, $VO_1$ is a
definition of V, and there exist execution paths
$p_1 \in \mathrm{IVP}(r_1,VO_1)$, $p_0 \in \mathrm{path}_{G^*}(VO_1,VO)$ such that
$CM(p_1) = \gamma_1$, $p \equiv p_1 \| p_0 \in \mathrm{IVP}(r_1,VO)$, $CM(p) = \gamma$ and $p_0$
contains no definitions (modifications) of V. (Here we use
a convention of not distinguishing between execution paths,
leading from one instruction to another, and graph paths,
leading from one basic block to another. In fact, the notions
of interprocedural validity and CM value can easily be extended
to execution paths.) Heuristically, we link each variable
use VO and interprocedural 'flow-summary' $\gamma$ tagging some
execution paths leading to VO to a preceding definition VO'
and flow summary $\gamma'$ tagging prefixes of these paths.
This will enable us to filter out invalid concatenations
of these paths.
We omit here any details on computation of the ud* map,
which may not be effective if $\Gamma$ is infinite (i.e. if the
program to be analyzed is recursive). We will however
explore this issue in detail later on.

As an example, consider the program given at the beginning
of this section, and assume that both procedures P1 and P2
are called from a main program by the call instructions c', c"
respectively. Then we have

    ud*{(x_4,(c'2))} = {(x_1,(c'))}

where (c'), (c'2) are call strings denoting the sequences of
pending calls by which x_1 and x_4 have been reached. Likewise,

    ud*{(x_4,(c"6))} = {(x_5,(c"))}
    ud*{(y_3,(c'))}  = {(y_4,(c'2))}
    ud*{(y_7,(c"))}  = {(y_4,(c"6))}

An abstract flow graph approach in the interprocedural
case which uses call strings is described in detail in
Section 4 of [SP]. To obtain a corresponding ud*-approach
for attribute flow analyses, our objective is to compute
a map $\mathrm{attr}^*: \Pi \times \Gamma \to L_a$, which we shall represent as a
map from $\Pi$ into $L_a^* \equiv L_a^\Gamma$, such that for each $VO \in \Pi$, $\gamma \in \Gamma$,
$\mathrm{attr}^*(VO)(\gamma)$ is the attribute that VO attains if execution
proceeds along paths in $CM^{-1}(\gamma)$.

The attr* map should satisfy the following set of
equations (similar to equations (2.3)):

(5.1)
(i)   $\mathrm{attr}^*(VO)(\gamma) = \bigwedge \{ \mathrm{attr}^*(VO')(\gamma') : (VO',\gamma') \in \mathrm{ud}^*\{(VO,\gamma)\} \}$
      for each variable use $VO \in \Pi$ and $\gamma \in \Gamma$;

(ii)  $\mathrm{attr}^*(VO)(\gamma) = A_I(\mathrm{attr}^*(VO_1)(\gamma), \ldots, \mathrm{attr}^*(VO_k)(\gamma))$
      for each variable definition/modification $VO \in \Pi$
      appearing as an output occurrence in some instruction I
      whose input occurrences are $VO_1, \ldots, VO_k$,
      and each $\gamma \in \Gamma$.

Initialization of an iterative process to solve these
equations (if a workset-oriented propagation is used) is
carried out in a manner analogous to that mentioned in
Section 2, i.e. by initializing attr* for all output occurrences
VO of inputless instructions by their constant
attributes. In doing so, we can either assign the corresponding
attribute of VO to $\mathrm{attr}^*(VO)(\gamma)$ only for $\gamma \in \Gamma$ for which
$CM^{-1}(\gamma) \cap \mathrm{IVP}(r_1,VO) \ne \emptyset$ (i.e. for which there exists an
interprocedurally valid execution path from the program's
entry to VO which is 'tagged' by $\gamma$), or else rely on the definition
of the ud* map to ensure that no propagation that uses
$\mathrm{attr}^*(VO)(\gamma)$ for $\gamma$'s not satisfying the above condition will
ever take place, as $\mathrm{ud}^{*-1}\{(VO,\gamma)\}$ should be $\emptyset$ for such $\gamma$'s.

Applying equations (5.1) to the example program considered
above, it can easily be checked that one obtains

    attr*(x_4,(c'2)) = constant value 1
    attr*(x_4,(c"6)) = constant value 2
    attr*(y_4,(c'2)) = 1
    attr*(y_4,(c"6)) = 2
    attr*(y_3,(c')) = attr*(z_3,(c')) = 1
    attr*(y_7,(c")) = attr*(z_7,(c")) = 2

so that this technique enables us to deduce accurately
e.g. that z_7 has the constant value 2, a fact which would have
been lost by using the standard ud-propagation technique.
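
The values listed for the example can be checked with a minimal sketch of equations (5.1), in which every occurrence is paired with a call string. The ud* links are those given for the example; the `reach` table (which call strings tag each occurrence), the lattice encoding and the solver loop are assumptions made for illustration.

```python
TOP = "undefined"      # assumed encoding of "no information yet"
NAC = "not-constant"   # assumed encoding of "not a constant"

def meet(a, b):
    """Lattice meet for constant propagation."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    return a if a == b else NAC

C1, C2 = "c'", "c''"   # the two call instructions in the main program

# Call strings tagging valid paths to each occurrence (hand-written).
reach = {
    "x1": [(C1,)], "x5": [(C2,)],
    "x4": [(C1, 2), (C2, 6)], "y4": [(C1, 2), (C2, 6)],
    "y3": [(C1,)], "z3": [(C1,)],
    "y7": [(C2,)], "z7": [(C2,)],
}

# ud* links of the example: (use, call string) -> [(definition, call string)].
ud_star = {
    ("x4", (C1, 2)): [("x1", (C1,))],
    ("x4", (C2, 6)): [("x5", (C2,))],
    ("y3", (C1,)): [("y4", (C1, 2))],
    ("y7", (C2,)): [("y4", (C2, 6))],
}

# Each definition: (transfer function, input occurrences of its instruction).
defs = {
    "x1": (lambda: 1, []), "x5": (lambda: 2, []),
    "y4": (lambda v: v, ["x4"]),
    "z3": (lambda v: v, ["y3"]), "z7": (lambda v: v, ["y7"]),
}

def propagate_star(ud_star, defs, reach):
    """Solve (5.1) by naive iteration over (occurrence, call string) pairs."""
    attr = {(o, g): TOP for o, gs in reach.items() for g in gs}
    changed = True
    while changed:
        changed = False
        for key in attr:
            occ, gamma = key
            if key in ud_star:                       # rule (i): uses
                new = TOP
                for src in ud_star[key]:
                    new = meet(new, attr[src])
            else:                                    # rule (ii): definitions
                fn, inputs = defs[occ]
                new = fn(*(attr[(i, gamma)] for i in inputs))
            if new != attr[key]:
                attr[key] = new
                changed = True
    return attr

attr_star = propagate_star(ud_star, defs, reach)
```

Here the two contexts of `y4` are kept apart, so `z7` under `(c'')` receives only the value 2 and `z3` under `(c')` only the value 1.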

Having demonstrated the usefulness of the more sophisticated
interprocedural ud* approach, we now proceed to describe
and analyze this approach in greater detail.

As noted above, using the approach outlined in Section 4
of [SP] will yield an interprocedural flow-graph approach to
attribute-flow analysis, which uses, instead of the framework
(L,F) defined in Section 2, a modified framework $(L^*,F^*)$,
where $L^* = L^\Gamma$ and $F^*$ is defined as follows. Each
instruction I which is neither a call nor a return instruction
induces a function $H_I^* \in F^*$, so that for each $x^* \in L^*$, $\gamma \in \Gamma$
\[ H_I^*(x^*)(\gamma) = H_I(x^*(\gamma)) \]
where $H_I$ is as defined in Section 2. If I is a call instruction
(thus, as assumed in [SP], constituting a call block
$m \in N^*$, calling some procedure p) then we define
\[ H_I^*(x^*)(\gamma) = \begin{cases}
   x^*(\gamma_1), & \text{if there exists (necessarily a unique) } \gamma_1
                    \text{ such that } \gamma = \gamma_1 \circ (m, r_p); \\
   \text{undefined}, & \text{otherwise.} \end{cases} \]
If I is a return instruction from some procedure p (thus
constituting the exit block $e_p$ of p), then we define a separate
function $H^*_{(I,J)}$ for each J that is an initial instruction of
a block n which follows immediately a call to p, so that
\[ H^*_{(I,J)}(x^*)(\gamma) = \begin{cases}
   x^*(\gamma_1), & \text{if there exists (necessarily a unique) } \gamma_1
                    \text{ such that } \gamma = \gamma_1 \circ (e_p, n); \\
   \text{undefined}, & \text{otherwise.} \end{cases} \]
Heuristically, propagation of attribute data through call or
return instructions does not change any attributes but only
changes the interprocedural flow summary (call string) tagging
each computed attribute.

We then define $F^*$ to be the smallest set of functions
acting in $L^*$ which contains all the above functions $H_I^*$, $H^*_{(I,J)}$,
and the identity map, and which is closed under functional
composition and meet. For notational uniformity we will sometimes
regard all the H* functions as edge functions, writing
$H^*_{(I,J)}$ instead of $H_I^*$ for all nonreturn instructions I and
their successors J as well.

We will also extend the $\circ$ operation to apply to the
interprocedural instruction-flow graph, which we denote as
$(N_I^*, E_I^*, I_0)$, where $N_I^*$ is the set of all instructions in the
program being analyzed, where $E_I^*$ is the set of all pairs of
control-adjacent instructions (the transfer of control being either
intraprocedural or interprocedural) and where $I_0$ is the entry
instruction of the main program. We do this by defining, for
each $\gamma \in \Gamma$ and each $(I,J) \in E_I^*$,
\[ \gamma \circ (I,J) = \begin{cases}
   \gamma \circ (m,n), & \text{if I is a branch from node m to node n
                         whose initial instruction is J;} \\
   \gamma, & \text{otherwise.} \end{cases} \]
This extended definition allows us to extend the CM map and
the notion of interprocedural validity of flow-graph paths
to execution paths (i.e. paths in the instruction-flow graph)
as well, and to define F* in terms of the instruction flow
graph alone.
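
The re-tagging performed by call and return instructions can be illustrated with a short sketch. It assumes the usual call-string convention of [SP]: a call edge appends the calling block to the string, and a return edge is defined only when the last pending call is the one that the return target follows. The edge tables and block names (statement numbers, `"rQ"`, `"eQ"`) are hypothetical, and a missing key in a dictionary stands for "undefined".

```python
def extend(gamma, edge, call_edges, return_edges):
    """gamma o edge: the call string after traversing edge, or None if undefined."""
    if edge in call_edges:                 # edge = (m, r_p): push call block m
        return gamma + (edge[0],)
    if edge in return_edges:               # edge = (e_p, n): pop matching call
        caller = return_edges[edge]
        if gamma and gamma[-1] == caller:
            return gamma[:-1]
        return None                        # return does not match pending call
    return gamma                           # intraprocedural edge: unchanged

def h_star_jump(x_star, edge, call_edges, return_edges):
    """H* of a call or return: values are unchanged, only their tags move."""
    out = {}
    for gamma1, value in x_star.items():
        gamma = extend(gamma1, edge, call_edges, return_edges)
        if gamma is not None:
            out[gamma] = value
    return out

# Hypothetical encoding of the example program's interprocedural edges.
call_edges = {(2, "rQ"), (6, "rQ")}                 # calls to Q from (2) and (6)
return_edges = {("eQ", 3): 2, ("eQ", 7): 6}         # return target -> its call

# Attribute maps at the exit of Q, one per call string.
at_exit_of_q = {("c'", 2): {"y": 1}, ("c''", 6): {"y": 2}}
back_in_p1 = h_star_jump(at_exit_of_q, ("eQ", 3), call_edges, return_edges)
back_in_p2 = h_star_jump(at_exit_of_q, ("eQ", 7), call_edges, return_edges)
```

Only the data tagged by `("c'", 2)` survives the return to statement (3), which is how invalid concatenations of paths are filtered out.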

Having defined $(L^*,F^*)$ we can then apply the analysis
technique outlined in Section 4 of [SP] to this framework
(noting that $(L^*,F^*)$ and (L,F) bear the relationship assumed
there), to define an interprocedural maximal fixpoint solution
$x^*: N^* \to L^*$. Likewise, we can also define a similar solution
$z^*: N_I^* \to L^*$, which is the maximal fixpoint solution of the
following set of equations (analogous to equations (3.1)):

(5.2)
\[ z^*_{I_0} = \{(\lambda, \Omega^*)\}, \text{ where } \lambda \text{ is the null call string and }
   \Omega^* \text{ maps every } V \in \Sigma \text{ to } \Omega; \]
\[ z^*_I = \bigwedge \{ H^*_{(J,I)}(z^*_J) : (J,I) \in E_I^* \}, \text{ for all other instructions I.} \]

Even though both the x* solution and the z* solution need
not be effectively computable in the recursive case, they are
both well defined, by the process described in Section 4
of [SP]. However, they are effectively computable if the
program being analyzed contains no recursion.
We can now carry out an analysis quite analogous to
that given in Section 3, which shows that the ud* and the
flow-graph approaches to attribute flow analysis are also
equivalent in the interprocedural case. We begin with

THEOREM 5.1.  For each $n \in N^*$, $x^*_n = z^*_{I_n}$, where $I_n$ is
the initial instruction of n.

Proof:  Completely analogous to the proof of Theorem 3.1,
with the additional observation that for each $m \in N^*$,
consisting of the instructions $J_1, \ldots, J_t$ and terminating at
a branch to some other node $n \in N^*$, we have
\[ f^*_{(m,n)} = H^*_{J_t} \circ H^*_{J_{t-1}} \circ \cdots \circ H^*_{J_1} \]
which can be informally interpreted as
\[ (H_{J_t} \circ \cdots \circ H_{J_1})^* = H^*_{J_t} \circ \cdots \circ H^*_{J_1} . \]
This is because call strings tagging data at $J_1$ do not
change during the flow from $J_1$ to n (except possibly at the branch
to n, which might be a procedure call or return). This observation
allows us to formulate the proof of this theorem by
replacing (L,F) by (L*,F*) in the proof of Theorem 3.1.

Q.E.D.

Next we have the following analog of Lemma 3.2, which
shows how variable attributes at a given point depend on
preceding definitions of that variable, and which ensures that
only interprocedurally valid concatenations of execution paths
are traced:
Lemma 5.2.  For each $I \in N_I^*$, $V \in \Sigma$, $\gamma \in \Gamma$,

(5.3)
\[ z^*_I(V)(\gamma) = \bigwedge \{ H^*_J(z^*_J)(V)(\gamma_1) : \]
      V is modified in J, $\gamma_1 \in \Gamma$, and there exist paths
      $p_1 \in \mathrm{IVP}(r_1,J) \cap CM^{-1}\{\gamma_1\}$, $p_0 \in \mathrm{path}_{G^*}(J,I)$ such that
      $p_1 \| p_0 \in \mathrm{IVP}(r_1,I) \cap CM^{-1}\{\gamma\}$ and $p_0$ is
      free of modifications of V $\}$.

Proof:  First note that J cannot be an interprocedural jump,
since we assume that such jumps do not modify any variables.
The proof is again analogous to that of Lemma 3.2 and goes
as follows. Denote the right-hand side of (5.3) by
$\hat z^*_I(V)(\gamma)$. Let $I \in N_I^*$, $V \in \Sigma$, $\gamma \in \Gamma$ be given, and let
$J \in N_I^*$ be an instruction appearing in the expression for
$\hat z^*_I(V)(\gamma)$. Let $\gamma_1 \in \Gamma$, and let $p_0 = (J = K_1, K_2, \ldots, K_s = I)$
and $p_1 \in \mathrm{IVP}(r_1,J) \cap CM^{-1}\{\gamma_1\}$ be execution paths satisfying
the condition in that expression. Then if we define
\[ \gamma_2 = \gamma_1 \circ (J,K_2) \]
\[ \gamma_3 = \gamma_2 \circ (K_2,K_3) \]
\[ \vdots \]
\[ \gamma_s = \gamma_{s-1} \circ (K_{s-1},I) \]
then it is easily seen that for all $2 \le j \le s$, $\gamma_j$ is defined
and $\gamma_s = \gamma$ (cf. e.g. Lemmas 3.1 and 4.1 of [SP]). By (5.2)
we then have
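\[ z^*_I(V)(\gamma) = z^*_{K_s}(V)(\gamma_s) \le H^*_{(K_{s-1},K_s)}(z^*_{K_{s-1}})(V)(\gamma_s) . \]
Hence, since none of $K_2, \ldots, K_{s-1}$ modifies V, repeating this
step along $p_0$ gives
\[ z^*_I(V)(\gamma) \le z^*_{K_2}(V)(\gamma_2) , \]
whence
\[ z^*_I(V)(\gamma) \le H^*_J(z^*_J)(V)(\gamma_1) , \]
and therefore $z^*_I(V)(\gamma) \le \hat z^*_I(V)(\gamma)$.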


To prove the converse statement we show that if $z_I^{*(k)}$
denotes the value of $z^*_I$ after k steps of the iterative process
which defines that solution, and $\hat z_I^{*(k)}$ denotes the corresponding
value of $\hat z^*_I$, then

(5.4)   $z_I^{*(k)}(V)(\gamma) \ge \hat z_I^{*(k)}(V)(\gamma)$, for each $I \in N_I^*$, $V \in \Sigma$, $\gamma \in \Gamma$ and $k \ge 0$

which we prove by induction on k. As before, the case k = 0 is
trivial, and let us assume for simplicity that each iteration
step propagates information along a single edge in $E_I^*$. Assume
that (5.4) is true for some $k \ge 0$, and let $(J,\bar I)$ be the edge
along which information is propagated at the (k+1)-st iteration.
As before, (5.4) holds for k+1 for all $I \ne \bar I$. If $I = \bar I$,
$V \in \Sigma$, $\gamma \in \Gamma$, then we have (assuming a propagation mechanism
like that of Kildall [Ki])
\[ z_I^{*(k+1)}(V)(\gamma) = z_I^{*(k)}(V)(\gamma) \wedge H^*_{(J,I)}(z_J^{*(k)})(V)(\gamma) . \]

With no loss of generality, we may assume that
$\mathrm{IVP}(r_1,I) \cap CM^{-1}\{\gamma\} \ne \emptyset$, for otherwise $z_I^{*(k+1)}(V)(\gamma) = \Omega$ and (5.4)
holds trivially. By the induction hypothesis,
\[ z_I^{*(k)}(V)(\gamma) \ge \hat z_I^{*(k)}(V)(\gamma) \ge \hat z_I^{*(k+1)}(V)(\gamma) . \]
If V is modified in J, then (J,I) is intraprocedural. It follows that J
appears in the expression for $\hat z^*_I(V)(\gamma)$ iff there exists a
path $p \in CM^{-1}\{\gamma\} \cap \mathrm{IVP}(r_1,I)$ whose last edge is (J,I) such
that if we write $p = p_1 \| (J,I)$ then $CM(p_1) = \gamma$. If this is
not the case, then $H^*_{(J,I)}(z_J^{*(k)})(V)(\gamma) = \Omega$, and (5.4) is
again immediate. If this is the case, then, since
$z_J^{*(k)} \ge z_J^{*(k+1)}$ and $H^*_{(J,I)} = H^*_J$ is monotone in $L^*$, we obtain
\[ H^*_{(J,I)}(z_J^{*(k)})(V)(\gamma) \ge H^*_J(z_J^{*(k+1)})(V)(\gamma) \ge \hat z_I^{*(k+1)}(V)(\gamma) \]
so that (5.4) holds in this case as well. On the other hand,
if V is not modified in J, let $\gamma_1 \in \Gamma$ such that $\gamma_1 \circ (J,I) = \gamma$
(if no such $\gamma_1$ exists then the expression containing $H^*$ is $\Omega$
and the desired inequality is immediate). Then, by the induction
hypothesis
\[ H^*_{(J,I)}(z_J^{*(k)})(V)(\gamma) = z_J^{*(k)}(V)(\gamma_1) \ge \hat z_J^{*(k)}(V)(\gamma_1) \ge \hat z_J^{*(k+1)}(V)(\gamma_1) . \]
(Noting again that by our assumption, $CM^{-1}\{\gamma_1\} \cap \mathrm{IVP}(r_1,J) \ne \emptyset$.)
Let $K \in N_I^*$, $\gamma_2 \in \Gamma$ such that V is modified in K and there
exist paths
$p_2 \in \mathrm{IVP}(r_1,K) \cap CM^{-1}\{\gamma_2\}$, $p_0 \in \mathrm{path}_{G^*}(K,J)$ where
$p_1 \equiv p_2 \| p_0 \in \mathrm{IVP}(r_1,J) \cap CM^{-1}\{\gamma_1\}$ and $p_0$ is free of modifications
of V. Let $p = p_1 \| (J,I) = p_2 \| (p_0 \| (J,I))$. Then
$p_0 \| (J,I)$ is also free of modifications of V and
$p \in \mathrm{IVP}(r_1,I) \cap CM^{-1}\{\gamma\}$ (which follows from Lemma 4.1
of [SP]). Hence
\[ \hat z_J^{*(k+1)}(V)(\gamma_1) = \bigwedge \{ H^*_K(z_K^{*(k+1)})(V)(\gamma_2) : K, \gamma_2 \text{ as above} \}
   \ge \hat z_I^{*(k+1)}(V)(\gamma) \]
because, by the argument given above, every K, $\gamma_2$ appearing
in the meet above also appears in the expression for $\hat z^*_I(V)(\gamma)$.
Hence we obtain (5.4) also for this case. Passing to the
limit in k we obtain $z^*_I(V)(\gamma) \ge \hat z^*_I(V)(\gamma)$, from which the
lemma readily follows.  Q.E.D.

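Continuing as in Section 3, we next introduce a map
$\widehat{\mathrm{attr}}^*: \Pi \to L_a^*$, so that for each $I \in N_I^*$, $\gamma \in \Gamma$ and $V \in \Sigma$
which is an input argument of I, we define
\[ \widehat{\mathrm{attr}}^*(V_I)(\gamma) = z^*_I(V)(\gamma) \]
and if V is an output variable of I, we define
\[ \widehat{\mathrm{attr}}^*(V_I)(\gamma) = H^*_I(z^*_I)(V)(\gamma) . \]
Then we have

COROLLARY 5.3.  For each $I \in N_I^*$, $\gamma \in \Gamma$ and each variable
occurrence $V_I$ in I we have
\[ \widehat{\mathrm{attr}}^*(V_I)(\gamma) \le \mathrm{attr}^*(V_I)(\gamma) . \]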
Proof:  As before, it suffices to show that $\widehat{\mathrm{attr}}^*$ satisfies
equations (5.1). Suppose first that $V_I$ is an input occurrence
in I. By (5.3) we have
\[ \widehat{\mathrm{attr}}^*(V_I)(\gamma) = z^*_I(V)(\gamma) = \bigwedge \{ H^*_J(z^*_J)(V)(\gamma_1) : J, \gamma_1 \text{ as in (5.3)} \} . \]
But by the definition of ud*, J, $\gamma_1$ appear in the above
expression iff $(V_J,\gamma_1) \in \mathrm{ud}^*(V_I,\gamma)$. That is
\[ \widehat{\mathrm{attr}}^*(V_I)(\gamma) = \bigwedge \{ \widehat{\mathrm{attr}}^*(V_J)(\gamma_1) : (V_J,\gamma_1) \in \mathrm{ud}^*(V_I,\gamma) \} \]
which is (5.1)(i). $\widehat{\mathrm{attr}}^*$ also satisfies (5.1)(ii),
which is immediate from its definition.

Q.E.D.


LEMMA 5.4.  For each $I \in N_I^*$, $\gamma \in \Gamma$ and each occurrence $V_I$
in I we have
\[ \widehat{\mathrm{attr}}^*(V_I)(\gamma) = \mathrm{attr}^*(V_I)(\gamma) . \]

Proof:  It suffices to show that $\widehat{\mathrm{attr}}^*(V_I)(\gamma) \ge \mathrm{attr}^*(V_I)(\gamma)$.
To show this, let $\widehat{\mathrm{attr}}^{*(k)}(V_I)$ denote the value of this map
after the first k iterations in the process defining the
solution of (5.2).

We will show by induction on $k \ge 0$ that
\[ \widehat{\mathrm{attr}}^{*(k)}(V_I)(\gamma) \ge \mathrm{attr}^*(V_I)(\gamma) . \]
This, together with the continuity of the functions H*
(cf. Lemma 4.2 of [SP]) will imply the required inequality.
It suffices to consider the case where $CM^{-1}(\gamma) \cap \mathrm{IVP}(r_1,I) \ne \emptyset$.
As before, the inequalities are trivial for k = 0. Assume
that they hold for some $k \ge 0$, and consider the (k+1)-st
propagation step in the solution of (5.2), which propagates
data along an edge $(J,\bar I) \in E_I^*$. For any $I \ne \bar I$ the desired
inequalities obviously also hold for k+1. If $I = \bar I$ and $V_I$
is some input occurrence in I then
\[ \widehat{\mathrm{attr}}^{*(k+1)}(V_I)(\gamma) = z_I^{*(k+1)}(V)(\gamma) = z_I^{*(k)}(V)(\gamma) \wedge H^*_{(J,I)}(z_J^{*(k)})(V)(\gamma) . \]
But
\[ z_I^{*(k)}(V)(\gamma) = \widehat{\mathrm{attr}}^{*(k)}(V_I)(\gamma) \ge \mathrm{attr}^*(V_I)(\gamma) \]
by the induction hypothesis. If V is modified in J then
\[ H^*_{(J,I)}(z_J^{*(k)})(V)(\gamma) = \widehat{\mathrm{attr}}^{*(k)}(V_J)(\gamma) \ge \mathrm{attr}^*(V_J)(\gamma) \]
and $(V_J,\gamma) \in \mathrm{ud}^*(V_I,\gamma)$, since (J,I) is obviously intraprocedural.
Hence, by (5.1)(i),
\[ \mathrm{attr}^*(V_J)(\gamma) \ge \mathrm{attr}^*(V_I)(\gamma) \]
from which the required inequality readily follows.

On the other hand, if V is not modified in J, then by (5.4),
\[ H^*_{(J,I)}(z_J^{*(k)})(V)(\gamma) = z_J^{*(k)}(V)(\gamma_1) \ge \hat z_J^{*(k)}(V)(\gamma_1)
   \ge \hat z_I^{*(k)}(V)(\gamma) \]
(if there exists $\gamma_1$ such that $\gamma_1 \circ (J,I) = \gamma$; otherwise
an obvious inequality holds), where the last inequality is
obtained as in the proof of (5.4). As in Section 3, we obtain
\[ \hat z_I^{*(k)}(V)(\gamma) = \bigwedge \{ \widehat{\mathrm{attr}}^{*(k)}(V_K)(\gamma_2) : (V_K,\gamma_2) \in \mathrm{ud}^*(V_I,\gamma) \} \]
\[ \ge \bigwedge \{ \mathrm{attr}^*(V_K)(\gamma_2) : (V_K,\gamma_2) \in \mathrm{ud}^*(V_I,\gamma) \}
   = \mathrm{attr}^*(V_I)(\gamma) , \text{ by (5.1)(i),} \]
so that in this case we also have $\widehat{\mathrm{attr}}^{*(k+1)}(V_I)(\gamma) \ge \mathrm{attr}^*(V_I)(\gamma)$.

Finally, if $V_I$ is an output occurrence in I, the inequality
is established using a completely analogous proof to the one
given in Section 3.  Q.E.D.

We can now summarize our analysis in the following theorem,
which essentially states that the interprocedural ud* approach
yields the same solution as the interprocedural flow-graph
approach.

THEOREM 5.5.  Let $I \in N_I^*$ belong to a basic block $n \in N^*$ and
be preceded in this block by $I_1, \ldots, I_s$.

(i)   If $V_I$ is an input occurrence in I then
      $\mathrm{attr}^*(V_I)(\gamma) = H_{I_s} \circ H_{I_{s-1}} \circ \cdots \circ H_{I_1}(x^*_n(\gamma))(V)$, for each $\gamma \in \Gamma$.

(ii)  If $V_I$ is an output occurrence in I then
      $\mathrm{attr}^*(V_I)(\gamma) = H_I \circ H_{I_s} \circ \cdots \circ H_{I_1}(x^*_n(\gamma))(V)$, for each $\gamma \in \Gamma$.

Proof:  Immediate, as in Section 3. (In addition we make use
of the fact that control-flow within n is strictly intraprocedural.)
Q.E.D.
Remarks.

(1)  Note that, as in [SP], a final phase is usually required,
which combines all data gathered for any variable occurrence
into a single value, by computing
\[ \mathrm{attr}(V_I) = \bigwedge_{\gamma \in \Gamma} \mathrm{attr}^*(V_I)(\gamma) . \]
Note, however, that in the nondistributive case $\mathrm{attr}(V_I)$
need not be equal to $H_I \circ H_{I_s} \circ H_{I_{s-1}} \circ \cdots \circ H_{I_1}[x'_n](V)$,
where $x'_n$ is as defined in formula (4.2) of [SP] ($V_I$ being
an output occurrence), but may constitute a better solution
(simple examples using e.g. constant propagation with the
classical counterexample to distributivity (cf. [He])
can be constructed to show strict inequality between these
two solutions).

(2)  The remark made at the end of Section 3 still applies,
with obvious rewording, to the abstract interprocedural case
discussed above.
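
The final phase described in Remark (1) amounts to one meet per occurrence over all of its call strings. The following sketch assumes the constant-propagation lattice and a dictionary keyed by (occurrence, call string); both the encoding and the sample values are illustrative.

```python
TOP = "undefined"      # assumed encoding of "no information yet"
NAC = "not-constant"   # assumed encoding of "not a constant"

def meet(a, b):
    """Lattice meet for constant propagation."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    return a if a == b else NAC

def merge(attr_star):
    """Combine the per-call-string attributes of each occurrence by meet."""
    merged = {}
    for (occ, _gamma), value in attr_star.items():
        merged[occ] = meet(merged.get(occ, TOP), value)
    return merged

# Sample per-context results, as obtained for the example program.
sample = {
    ("y4", ("c'", 2)): 1,
    ("y4", ("c''", 6)): 2,
    ("z7", ("c''",)): 2,
}
merged = merge(sample)
```

In this sample the occurrence of y inside Q is reached in two contexts with different constants and so merges to `NAC`, while `z7` keeps its constant value 2.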

We next extend and modify the methods devised in this
section so far, to derive an interprocedural use-definition
technique, which replaces the set $\Gamma$ of all call strings by
some finite approximation $\hat\Gamma$, so that it becomes a convergent
implementable algorithm. Our treatment here is rather general,
and concerns itself with establishing the equivalence between
the use-definition approach and its flow-graph counterpart.

Our notations and assumptions in this section follow,
though in a rather simplified manner, those of Section 6 of [SP].
Let $\hat\Gamma$, $\omega$, $*$ and ECS be as defined there, where $\hat\Gamma$ denotes a semigroup
of approximate call-strings, where $\omega$ and $*$ denote the identity
element and binary operation for $\hat\Gamma$, and where ECS maps each
procedure p in the program to the set of all approximate
strings which can tag execution paths leading to the entry of p.
We note that instead of the operation $\circ$ we can define a
map $R: E^* \to 2^{\hat\Gamma \times \hat\Gamma}$, so that for each $(m,n) \in E^*$, $R_{(m,n)}$ is
a relation in $\hat\Gamma$, defined so that $\alpha_1 R_{(m,n)} \alpha_2$ iff $\alpha_2 \in \alpha_1 \circ (m,n)$.
Then the map CM can simply be defined as
\[ CM(q) = R_{(s_{k-1},s_k)} \circ R_{(s_{k-2},s_{k-1})} \circ \cdots \circ R_{(s_1,s_2)} (\{\omega\}) \]
for each path $q = (s_1 = r_1, s_2, \ldots, s_k)$. The set
$\mathrm{IAP}(r_1,n)$, $n \in N^*$, can then be defined as
\[ \{ q \in \mathrm{path}_{G^*}(r_1,n) : CM(q) \ne \emptyset \} . \]
Note that the data-flow framework can be constructed using
only the relation-map R; in particular we have for each
$(m,n) \in E^*$, $\xi \in L^*$, $\alpha \in \hat\Gamma$
\[ f^*_{(m,n)}(\xi)(\alpha) = \bigwedge \{ f_{(m,n)}(\xi(\alpha_1)) : \alpha_1 R_{(m,n)} \alpha \} . \]

In addition, one needs to know the 'initial' element $\omega$ of
$\hat\Gamma$ (which tags the null information at the entry node)
in order to define the associated data-flow problem (cf. (6.2)
in [SP]).  This observation enables us to view the approximative
approach of Section 6 of [SP] as a special case of a generalized
interprocedural framework, of which the abstract framework
discussed in the previous section is another special case,
realized by taking $\hat\Gamma = \Gamma$, $\omega = \lambda$ (the
null call string), and defining, for each $(m,n) \in E^*$,

\[ R_{(m,n)} = \{\, (\gamma, \gamma \circ (m,n)) : \gamma \in \Gamma,\ \gamma \circ (m,n) \text{ is defined} \,\}, \]

i.e. in this particular case $R_{(m,n)}$ is a partially defined
function, so that $CM(q)$ is the singleton $\{CM(q)\}$ (the inner
$CM$ being the call-string map of the abstract case) if
$q \in IVP(r_1,n)$ for some $n \in N^*$, and is $\emptyset$
otherwise.  These notations can be viewed as a special case of
those used in the qualified data-flow analysis technique of
Holley and Rosen [HR].
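
The role of the relation-map $R$ can be illustrated by a minimal Python sketch. It assumes a toy program in which main calls a procedure p at two call blocks, and an approximation that keeps only the most recent call site. The node names, the tag "w" standing for $\omega$, and the helper names `image`, `cm` and `is_admissible` are all hypothetical and are not taken from [SP]. The sketch computes CM(q) as the image of $\omega$ under the relations met along q, and treats a path as admissible when that image is nonempty.

```python
from typing import Dict, List, Set, Tuple

Tag = str
Edge = Tuple[str, str]
Relation = Set[Tuple[Tag, Tag]]


def image(rel: Relation, tags: Set[Tag]) -> Set[Tag]:
    """Image of a set of tags under a relation on the tag set."""
    return {b for (a, b) in rel if a in tags}


def cm(path: List[str], R: Dict[Edge, Relation], omega: Tag) -> Set[Tag]:
    """Tags that can label `path`: push omega through R edge by edge."""
    tags = {omega}
    for edge in zip(path, path[1:]):
        tags = image(R[edge], tags)
    return tags


def is_admissible(path: List[str], R: Dict[Edge, Relation], omega: Tag) -> bool:
    """A path is kept in IAP exactly when CM(path) is nonempty."""
    return bool(cm(path, R, omega))


# Toy tag set (assumption): "w" for omega, "c1"/"c2" for the last call site.
GAMMA = {"w", "c1", "c2"}
IDENT = {(g, g) for g in GAMMA}   # intraprocedural edges act as identity

R: Dict[Edge, Relation] = {
    ("r_main", "c1"): IDENT,
    ("c1", "r_p"): {("w", "c1")},   # call from block c1 enters p
    ("r_p", "e_p"): IDENT,
    ("e_p", "a1"): {("c1", "w")},   # return to the block after c1
    ("a1", "c2"): IDENT,
    ("c2", "r_p"): {("w", "c2")},   # call from block c2 enters p
    ("e_p", "a2"): {("c2", "w")},   # return to the block after c2
}

good = ["r_main", "c1", "r_p", "e_p", "a1"]   # call and matching return
bad = ["r_main", "c1", "r_p", "e_p", "a2"]    # returns to the wrong site

print(cm(good, R, "w"), cm(bad, R, "w"))
```

In this toy setting the mismatched return yields an empty tag set, which is how the relations filter out paths that are not interprocedurally admissible.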

We therefore proceed to generalize the analysis performed
earlier in this section to the case of an arbitrary choice of
$\hat\Gamma$, $\omega$ and $R$, and show that in this more general
case we also have equivalence of the flow-graph approach and the
use-definition approach.

Rather than repeat the whole analysis, we only comment on
the main modifications needed to carry out this generalization
of the preceding analysis:  replace the map IVP by IAP,
equalities of the form $CM(p) = \gamma$ by relations of the form
$\gamma \in CM(p)$, $\Gamma$ by $\hat\Gamma$, and $\lambda$ by
$\omega$.  Then, in the definition of the $H^*$ functions, define
$H^*_I$, for each call instruction $I$, as

\[ H^*_I(x^*)(\gamma) = \bigwedge \{\, x^*(\gamma_1) : \gamma_1 \in \hat\Gamma,\ \gamma_1 \, R_{(m_I,r_p)} \, \gamma \,\} \]

where $I$ calls $p$ and $m_I$ denotes the block consisting of $I$.
A similar redefinition of the $H^*$ map for procedure return
jumps is used.
We assume that for each intraprocedural edge $(I,J) \in E^*$,
$R_{(I,J)}$ is the identity relation on $\hat\Gamma$.  Using this
assumption, we can prove the appropriate variant of Theorem 5.1
in a manner completely analogous to the original proof.

In  the  first  part  of  the  modified  proof  of  Lemma  5.2, 
we  argue  that  it  follows  from  the  definition  of  CM  that 
there    exist      Yo' Y3' •  •  • 'T^.  =  Y   such  that 

In  the  chain  of  resulting  inequalities  we  then  have,  e.g. 

zJ^(V)(Y3)  l«tK2,K3)(%)(^)(^3^  =  ^  '  {%  ^^^  (^3^  =  ^3^K2,K3)  ^3} 

-  ^K2^^^^^2^  (Si^^^  ^2  ^(K2,K3)  ^3^ 

etc. 

Making   similar  modifications  in  subsequent   proofs  given 
earlier  in  this  section, we  can  obtain  an  appropriate  generalization 
of  the  preceding  analysis  which  establishes  the  equivalence 
between  our  two  attribute-flow   analysis  methods  in  the  general 
case  as  well.   Details  are  left  to  the  reader. 

The only remaining issue is how to compute the generalized
ud* map, so that sharper interprocedural ud-based analysis can
be performed.  We now address this problem.

A first possible technique, which is straightforward but
not very pragmatic, is to introduce a special data-flow
problem which can be interpreted as an interprocedural
extension of the classical 'reaching definitions' analysis
(cf. [He], e.g.).  The ud* map can then be derived from the
solution of that analysis in much the same way as the standard
ud map is derived from the solution of the reaching definitions
problem.  This is done as follows.

Assume that $\hat\Gamma$ is chosen to be finite.  Let
$L = 2^{\Pi \times \hat\Gamma}$, where $\Pi$ is the set of all
(global) variable definitions/modifications in the program to be
analyzed.  More precisely, as will be seen below, we require
elements of $L$ to contain only pairs of the form $(VO,\gamma)$,
where $VO \in \Pi$ is a definition occurring at some procedure
$p$ of the program, and $\gamma \in ECS(p)$, i.e. $\gamma$ can
tag some execution path leading to $VO$ from the program entry
(here, with no loss of generality, we make an implicit assumption
that all instructions within a procedure can be reached from its
entry).  Next, we define a new flow graph $(N,E,r_0)$ where

\[ N = N^* \times \hat\Gamma; \]

\[ E = \{\, ((n_1,\gamma_1),(n_2,\gamma_2)) : (n_1,n_2) \in E^* \text{ and } \gamma_1 \, R_{(n_1,n_2)} \, \gamma_2 \,\}; \]

\[ r_0 = (r_{main},\omega). \]


For each $((n_1,\gamma_1),(n_2,\gamma_2)) \in E$ and each
$x \in L$ we define

\[ f_{((n_1,\gamma_1),(n_2,\gamma_2))}(x) = \bigl( x \cap (\mathrm{nokill}(n_1,n_2) \times \hat\Gamma) \bigr) \cup \bigl( \mathrm{leave}(n_1,n_2) \times \{\gamma_1\} \bigr) \]

where $\mathrm{nokill}(n_1,n_2)$ is the set of all variable
definitions whose variables are not modified as control advances
from $n_1$ to $n_2$, and $\mathrm{leave}(n_1,n_2)$ is the set of
all variable definitions within $n_1$ which can reach the start
of $n_2$.  We define $F$ as the smallest set of functions acting
in $L$ which contains all those functions and the identity and
which is closed under functional compositions and meets.
Obviously, $F$ is distributive.

We can now associate the following data-flow problem
with $(L,F)$ and the new flow graph:  Compute a minimal fixpoint
solution $x: N \to L$ of the set of equations

(5.5)   \[ x(r_0) = \emptyset \]

        \[ x(n,\gamma) = \bigcup \{\, f_{((n_1,\gamma_1),(n,\gamma))}(x(n_1,\gamma_1)) : ((n_1,\gamma_1),(n,\gamma)) \in E \,\} \]

We have thus defined our data-flow problem using standard
notation, so that we can apply standard techniques to solve it
iteratively and obtain a solution which, by Kildall's theorem
[Ki], satisfies

\[ x(n,\gamma) = \bigcup \{\, f_p(\emptyset) : p \in \mathrm{path}_G(r_0,(n,\gamma)) \,\}. \]
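
A minimal Python sketch of this tagged reaching-definitions problem follows. It assumes a toy program in which main defines V, calls p, redefines V, and calls p again, while p leaves V untouched. All node, tag and definition names, and the functions `product_edges` and `solve_reaching`, are hypothetical. The sketch builds the edges of the product graph from a relation-map R and iterates equations (5.5) with a worklist, assuming that the entry node has no incoming edges.

```python
from collections import defaultdict, deque


def product_edges(R):
    """E = {((n1,g1),(n2,g2)) : (n1,n2) in E*, g1 R_(n1,n2) g2}."""
    return [((n1, g1), (n2, g2))
            for (n1, n2), rel in R.items()
            for (g1, g2) in rel]


def solve_reaching(R, nokill, leave, root):
    """Least solution of (5.5); x[(n, g)] is a set of (definition, tag) pairs."""
    succ = defaultdict(list)
    for src, dst in product_edges(R):
        succ[src].append(dst)
    x = defaultdict(set)
    seen = {root}
    work = deque([root])
    while work:
        node = work.popleft()
        n1, g1 = node
        for dst in succ[node]:
            if dst == root:          # x(r0) stays empty by (5.5)
                continue
            n2 = dst[0]
            # f(x) = (x ∩ nokill × tags) ∪ (leave × {g1})
            out = {(d, t) for (d, t) in x[node] if d in nokill[(n1, n2)]}
            out |= {(d, g1) for d in leave[(n1, n2)]}
            if dst not in seen or not out <= x[dst]:
                seen.add(dst)
                x[dst] |= out
                work.append(dst)
    return x


GAMMA = {"w", "c1", "c2"}             # "w" stands for omega (assumption)
IDENT = {(g, g) for g in GAMMA}
R = {
    ("r_main", "c1"): IDENT,
    ("c1", "r_p"): {("w", "c1")},
    ("r_p", "e_p"): IDENT,
    ("e_p", "a1"): {("c1", "w")},
    ("a1", "c2"): IDENT,
    ("c2", "r_p"): {("w", "c2")},
    ("e_p", "a2"): {("c2", "w")},
}
ALL = {"d_main", "d_a1"}              # both definitions assign V
nokill = {e: set(ALL) for e in R}     # by default nothing is killed
leave = {e: set() for e in R}
nokill[("r_main", "c1")] = set()      # block r_main redefines V ...
leave[("r_main", "c1")] = {"d_main"}  # ... and d_main leaves it
nokill[("a1", "c2")] = set()          # block a1 redefines V ...
leave[("a1", "c2")] = {"d_a1"}        # ... and d_a1 leaves it

x = solve_reaching(R, nokill, leave, ("r_main", "w"))
print(sorted(x[("e_p", "c1")]), sorted(x[("e_p", "c2")]))
```

In this example the two tags keep the two calling contexts of p apart: only d_main is recorded under tag c1 and only d_a1 under tag c2, whereas an untagged analysis would merge both definitions at the entry of p.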


The  following  can  then  easily  be  checked. 

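LEMMA 5.6.   Let $p \in \mathrm{path}_G(r_0,(n,\gamma))$.  Write $p$ as

\[ ((r_{main},\omega),\ (n_2,\gamma_2),\ (n_3,\gamma_3),\ \ldots,\ (n_{s-1},\gamma_{s-1}),\ (n,\gamma)), \]

and put $n_1 = r_{main}$, $\gamma_1 = \omega$, $n_s = n$.  Then

(a)  the path $\bar p = (r_{main}, n_2, n_3, \ldots, n) \in IAP(r_{main},n)$
and $\gamma \in CM(\bar p)$ (a similar statement obviously also
holds for each initial subpath of $p$)

(b)  $(VO',\gamma') \in f_p(\emptyset)$ iff $\exists\, k < s$ such
that $\gamma_k = \gamma'$ and $VO' \in \mathrm{leave}(n_k,n_{k+1})$
and also $VO' \in \mathrm{nokill}(n_j,n_{j+1})$,
$j = k+1, \ldots, s-1$.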
COROLLARY 5.7.   For each $(n,\gamma) \in N$, $x(n,\gamma)$ is the
set of all pairs $(VO',\gamma')$ such that $VO'$ is a definition
of some variable $V$ for which there exist paths
$p' \in IAP(r_1,VO') \cap CM^{-1}\{\gamma'\}$ and
$p_0 \in \mathrm{path}_{G^*}(VO',n)$ such that
$p' \| p_0 \in IAP(r_1,n) \cap CM^{-1}\{\gamma\}$ and $V$ is not
redefined/modified along $p_0$.

The last corollary implies that the ud* map can be
computed from the solution $x$ in a manner completely
analogous to the way in which the intraprocedural ud map
is computed from the solution to the reaching definitions
analysis.

Having  described  this  basic  algorithm,  we  will  now 
suggest  some  possible  improvements. 

First we observe that the $R$ relations are nontrivial
only for interprocedural edges.  This means that we can
break our analysis into an intraprocedural phase followed
by a rather compact interprocedural phase, thereby gaining
considerable efficiency.  Specifically, we proceed as
follows:  At each procedure entry $r_p$ and immediately after
each procedure call $c$ (a point denoted as $c^+$), we insert
dummy definitions of each global variable $V$ (denoted as
$V_{r_p}$ and $V_{c^+}$ respectively).  Similarly, before each
procedure return $e_p$ and just before each procedure call $c$
(a point denoted as $c^-$), we insert dummy uses of each global
variable $V$ (denoted as $V_{e_p}$ and $V_{c^-}$ respectively).
Having done this, we subject each procedure $p$ to a standard
intraprocedural ud computation analysis.  Let ud denote the
union of all resulting maps (taken over all procedures in the
program).

We next compute our ud* map as follows:  First introduce
a map REACH so that for each global variable $V$, each
instruction $i$ which is either a procedure entry, an exit,
a pre-call, or a post-call, and each $\gamma \in \hat\Gamma$,

\[ REACH(i,\gamma,V) = \{\, (VO',\gamma') \in x(n_i,\gamma) : \mathrm{var}(VO') = V \,\} \]

where $n_i$ is the block containing $i$.

The  following  lemma  can  be  shown. 

LEMMA 5.8.    The map REACH satisfies the following set of
equations:

(i)    For each procedure entry $r_p$, $\gamma \in \hat\Gamma$,
$V$ a global variable,

\[ REACH(r_p,\gamma,V) = \bigcup \{\, REACH(c^-,\gamma_1,V) : c \text{ is a call to } p,\ \gamma_1 \, R_{(c,r_p)} \, \gamma \,\} \]

(5.6)  (ii)  For each post-call point $c^+$, where $c$ calls
procedure $p$, $\gamma \in \hat\Gamma$ and $V$ global,

\[ REACH(c^+,\gamma,V) = \bigcup \{\, REACH(e_p,\gamma_1,V) : \gamma_1 \, R_{(e_p,c^+)} \, \gamma \,\} \]

(iii)  For each pre-call point or exit $J$, which is in some
procedure $q$, each $\gamma \in ECS(q)$ and a global $V$,

\[ REACH(J,\gamma,V) = \{\, (V_I,\gamma) : V_I \in ud(V_J) \text{ and } I \text{ is not a post-call or an entry (i.e. } V_I \text{ is not dummy)} \,\} \]
\[ \qquad \cup\ \bigcup \{\, REACH(I,\gamma,V) : V_I \in ud(V_J) \text{ and } I \text{ is a post-call or an entry} \,\} \]
Proof:   (i) and (ii) are obvious from the equations defining
the map $x$.  (iii) is an easy consequence of Corollary 5.7,
where we also note that the node $(J,\gamma)$ is reachable from
the entry node in our modified flow graph iff $\gamma \in ECS(q)$,
according to the special nature of the $R$ relations.      Q.E.D.

We thus apply the equations of Lemma 5.8 iteratively
to obtain a minimal fixpoint solution.  It can also be checked
that this solution indeed coincides with the definition of
REACH in terms of the map $x$.  Then the interprocedural ud*
map can be computed as follows:

For each use $VO \in \Pi$ of a variable $V$, occurring in
some procedure $q$, and each $\gamma \in ECS(q)$, we have

(5.7)   \[ ud^*(VO,\gamma) = \{\, (VO',\gamma) : VO' \in ud(VO) \text{ and is not dummy} \,\} \]
        \[ \qquad \cup\ \bigcup \{\, REACH(J,\gamma,V) : J \text{ is a post-call or the entry of } q \text{ and } V_J \in ud(VO) \,\}. \]

We omit here a proof of this formula.

To  conclude,  we  propose  a  four-phase  procedure  to  compute 
the  ud*  map: 

(a)  Perform intraprocedural computation of the ud map for
each procedure in the program, annotated with dummy definitions
and uses of relevant global variables as explained above.

(b)  Compute the ECS map, using e.g. equations (6.1) of [SP].

(c)  Compute the REACH map, by equations (5.6).

(d)  Compute the ud* map, by formula (5.7) given above.

This  algorithm  is  rather  appealing  as  it  can  be  viewed 
as  an  extension  of  any  existing  intraprocedural  ud  computation 
algorithm.   It is easy to check that the same algorithm, with
some slight modifications, can be applied to the computation
of the BFROM map, referred to in an earlier comment.
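
Phases (c) and (d) of the procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: it tracks a single global variable V (so the V argument of REACH is dropped), takes the results of phases (a) and (b) as given inputs, and uses hypothetical names throughout ("c1-" and "c1+" for pre-call and post-call points, "w" for $\omega$, and the functions `resolve`, `solve_reach` and `ud_star`).

```python
from collections import defaultdict


def resolve(use, g, ud, dummy_defs, reach):
    """Common right-hand side of equation (iii) and formula (5.7)."""
    out = set()
    for d in ud[use]:
        if d in dummy_defs:
            out |= reach[(d, g)]      # look through an entry or post-call
        else:
            out.add((d, g))           # a real definition, tagged by g
    return out


def solve_reach(calls, returns, ud, dummy_defs, dummy_uses, ecs, proc_of):
    """Least fixpoint of the equations of Lemma 5.8 by repeated union."""
    reach = defaultdict(set)

    def add(key, items):
        before = len(reach[key])
        reach[key] |= items
        return len(reach[key]) != before

    changed = True
    while changed:
        changed = False
        for (pre, entry), rel in calls.items():        # equation (i)
            for g1, g in rel:
                changed |= add((entry, g), reach[(pre, g1)])
        for (exit_, post), rel in returns.items():     # equation (ii)
            for g1, g in rel:
                changed |= add((post, g), reach[(exit_, g1)])
        for j in dummy_uses:                           # equation (iii)
            for g in ecs[proc_of[j]]:
                changed |= add((j, g), resolve(j, g, ud, dummy_defs, reach))
    return reach


def ud_star(use, g, ud, dummy_defs, reach):
    """Formula (5.7) for a use inside a procedure q with g in ECS(q)."""
    return resolve(use, g, ud, dummy_defs, reach)


# Toy input: main defines V (d_main), calls p, redefines V (d_a1), calls p.
# Procedure p only reads V at the use u_p.
calls = {("c1-", "r_p"): {("w", "c1")}, ("c2-", "r_p"): {("w", "c2")}}
returns = {("e_p", "c1+"): {("c1", "w")}, ("e_p", "c2+"): {("c2", "w")}}
ud = {"c1-": {"d_main"}, "c2-": {"d_a1"}, "e_main": {"c2+"},
      "u_p": {"r_p"}, "e_p": {"r_p"}}                  # phase (a) output
dummy_defs = {"r_main", "r_p", "c1+", "c2+"}
dummy_uses = ["c1-", "c2-", "e_main", "e_p"]
ecs = {"main": {"w"}, "p": {"c1", "c2"}}               # phase (b) output
proc_of = {"c1-": "main", "c2-": "main", "e_main": "main",
           "e_p": "p", "u_p": "p"}

reach = solve_reach(calls, returns, ud, dummy_defs, dummy_uses, ecs, proc_of)
print(ud_star("u_p", "c1", ud, dummy_defs, reach),
      ud_star("u_p", "c2", ud, dummy_defs, reach))
```

Since every equation is a union of monotone terms, accumulating from empty sets until nothing changes gives the minimal solution; a worklist ordering would avoid rescanning all equations on each pass.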